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Abstract 

Wireless sensor networks are often used for environmental monitoring applications. In this context 
sampling and reconstruction of a physical field is one of the most important problems to solve. We focus 
on a bandlimited field and find under which conditions on the network topology the reconstruction of 
the field is successful, with a given probability. We review irregular sampling theory, and analyze the 



> 
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problem using random matrix theory. We show that even a very irregular spatial distribution of sensors 



may lead to a successful signal reconstruction, provided that the number of collected samples is large 



enough with respect to the field bandwidth. Furthermore, we give the basis to analytically determine 
• the probability of successful field reconstruction. 

Keywords: Irregular sampling, random matrices, Toeplitz matrix, eigenvalue distribution. 

I. Introduction 

One of the most popular applications of wireless sensor networks is environmental monitoring. 
In general, a physical phenomenon (hereinafter also called sensor field or physical field) may 
vary over both space and time, with some band limitation in both domains. In this work, we 
address the problem of sampling and reconstruction of a spatial field at a fixed time instant. 
We focus on a bandlimited field (e.g., pressure and temperature), and assume that sensors are 
randomly deployed over a geographical area to sample the phenomenon of interest. 

This work was supported through the PATTERN project 
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Data are transfered from the sensors to a common data-collecting unit, the so-called sink node. 
In this work, however, we are concerned only with the reconstruction of the sensor field, and we 
do not address issues related to information transport. Thus, we assume that all data is correctly 
received at the sink node. Furthermore, we assume that the sensors have a sufficiently high 
precision so that the quantization error is negligible, and the sensors position is known at the 
sink node. The latter assumption implies that nodes are either located at pre-defined positions, 
or, if randomly deployed, their location can be acquired (see [8]— [10] for a description of node 
location methods in sensor networks). 

Our objective is to investigate the relation between the network topology and the probability 
of successful reconstruction of the field of interest. The success of the reconstruction algorithm 
strongly depends on the given machine precision, since it may fail to invert some ill-conditioned 
Toeplitz matrix (see Section Hill) . 

More specifically, we pose the following question: under which conditions on the network 
topology (i.e., on the sample distribution) the sink node successfully reconstructs the signal 
with a given probability? The solution to the problem seems to be hard to find, even under the 
simplifying assumptions we described above. 

The main contributions of our work are summarized below. 

( i) We first consider deterministic sensor locations. By reviewing irregular sampling theory [1], 
we show some sufficient conditions on the number of sensors to be deployed and on how 
they should be spatially spaced so as to successfully reconstruct the measured field. 

( ii) We then consider a random network topology and analyze the problem using random matrix 
theory. We identify the conditions under which the filed reconstruction is successful with a 
fixed probability, and we show that even a very irregular spatial distribution of sensors may 
lead to a successful signal reconstruction, provided that the number of collected samples 
is large enough with respect to the field bandwidth. 

(Hi) Finally we provide the theoretical basis to estimate the required number of active sensors, 
given the field bandwidth. 

II. Related work 

Few papers have addressed the problem of sampling and reconstruction in sensor networks. 
Efficient techniques for spatial sampling in sensor networks are proposed in [2], [3]. In particu- 
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lar [2] presents an algorithm to determine which sensor subsets should be selected to acquire data 
from an area of interest and which nodes should remain inactive to save energy. The algorithm 
chooses sensors in such a way that the node positions can be mapped into a blue noise binary 
pattern. In [3], an adaptive sampling is described, which allows the central data-collector to 
vary the number of active sensors, i.e., samples, according to the desired resolution level. Data 
acquisition is also studied in [4], where the authors consider a unidimensional field, uniformly 
sampled at the Nyquist frequency by low precision sensors. The authors show that the number 
of sensors (i.e., samples) can be traded-off with the precision of sensors. The problem of the 
reconstruction of a bandlimited signal from an irregular set of samples at unknown locations is 
addressed in [5]. There, different solution methods are proposed, and the conditions for which 
there exist multiple solutions or a unique solution are discussed. 

Note that our work significantly differs from the studies above because we assume that the 
sensors location are known (or can be determined [8]— [10]) and the sensor precision is sufficiently 
high so that the quantization error is negligible. The question we pose is instead under which 
conditions (on the network system) the reconstruction of a bandlimited signal is successful with 
a given probability. 

III. Irregular sampling of band-limited signals 

Let us consider the one-dimensional model where r sensors, located in the normalized interval 
[0, 1), measure the value of a band-limited signal pit). As a first step, we assume that the position 
of the sensors sampling the field are deterministic and known, and the sensors can represent each 
sample with a sufficient number of bits so that the quantization error is negligible. Let t q E [0, 1) 
for q — 1 . . . , r be the deterministic locations of the sampling points ordered increasingly and 
p(t g ) the corresponding samples. 

A strictly band-limited signal over the interval [0, 1) can be written as the weighted sum of 
M' harmonics in terms of Fourier series 

w 

P (t)= a ^ ikt (D 

k=-M> 

Note that for real valued signals the Fourier coefficients satisfy the relation a* k = a-k and that 
the series £0 can be represented as a sum of cosines. 

The reconstruction problem can be formulated as follows: 
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given r pairs [t q ,p{t q )) for q 



, r and t q G [0,1) find the band-limited signal in © 



uniquely specified by the sequence of its Fourier coefficients a k - 
Let the reconstructed signal be 



M 



p(t)= E 



Jlmkt 



(2) 



k=-M 



where the a k are the corresponding Fourier coefficients up to the M-th harmonic. In general, 
the reconstruction procedure will minimize \\p(t) — p(t)\\ 2 if M < M' and give p(t) = p(t) if 
M = M'. 

Consider the (2M + 1) x r matrix F whose (k, q)-th element is defined by 

1 2nikt k = ~M,...,M 

(F) M = ^=e 2mfc ^ 

V' q — 1, . . . , r 

the vector a = [a_ M , . . . ,a , . . . , o,m] t of size 2M + 1 and the vector 
P = \p(ti), ■ ■ ■ ,p(U)} T - We have the following linear system [1]: 

FF f a = Fp (3) 

where is the conjugate transpose operator. Let us denote T = FF^ and b = Fp, hence © 
becomes Ta = b and then a = T _1 b. 

When the samples are equally spaced in the interval JO, 1), i.e., t q = (q—l)/r, we observe that 
the matrix F is a unitary matrix (FF+ = T = I2M+1) □ and its rows are orthonormal vectors of 
an inverse DFT matrix. In this case © gives the first M Fourier coefficients of sample sequence 
P- 

When the samples t q are not equally spaced, the matrix F is no longer unitary and the matrix 
T becomes a (2M + 1) x (2M + 1) Hermitian Toeplitz matrix 

I r n ■■■ r 2M \ 
r_i r • ■ • r 2 M-i 



T t 



\ r-2M 



r J 



'The symbol I n represents the n by n identity matrix 
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where 



1 r 

T) fc , m = r k . m = -J2 e 2 ^ k - m »« k,m = -M...,M (4) 



r 

9=1 



The above Toeplitz matrix T is uniquely defined by the AM + 1 variables 



1 r 

r t = - z 2mUq t = -2M, . . . 2M (5) 

r 9=1 

The solution of ©, which involves the inversion of T, requires some care if the condition 
number of T (or equivalently of F) becomes large. We recall that the condition number of T is 
defined as 

Amax ,,r\ 

K = - — (6) 

^min 

where A max and A min are the largest and the smallest eigenvalues of T, respectively. The base- 10 
logarithm of k is an estimate of how many base- 10 digits are lost in solving a linear system 
with that matrix. 

In practice, matrix inversion is usually performed by algorithms which are very sensitive to 
small eigenvalues, especially when smaller than the machine precision. For this reason in [1] a 
preconditioning technique is used to guarantee a bounded condition number when the maximum 
separation between consecutive sampling points is not too large. More precisely, by defining 
w q = (t q+ i — t q -\)/2 for q = 1 . . . ,r, where t = t r — 1 and t r+ i = 1 + 1\, and by letting 
W = diag(wi, . . . , w r ), the preconditioned system becomes 

where T w = FWF^ and h w = FWp. Let us define the maximum gap between consecutive 
sampling points as 

5 = max(tq — tg-i). 
In [1] it is shown that, when 5 < l/2M,we have: 

K(TJ <ii±I^V (7) 



1 - 25M J 

This result generalizes the Nyquist sampling theorem to the case of irregular sampling, but only 
gives a sufficient condition for perfect reconstruction when the condition number is compatible 
with the machine precision. Unfortunately, when 5 > 1/2M, the result © does not hold. 
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In Figure \T\ and [2] we present two examples of reconstructed signals from irregular sampling, 
using ©. Figure Q] refers to the case M = 10 and r = 26, where the samples have been 
randomly selected over the interval [0,0.8). The signal is perfectly reconstructed even if large 
gaps are present (5 > 0.2, i.e., 5 > 1/2M). In Figure |2j r = 21 samples of the same signal of 
Figure [Qhave been taken randomly over the entire window [0, 1). Due to the bad conditioning 
of the matrix T (i.e., very low eigenvalues), the algorithm fails in reconstructing the signal due 
to machine precision underflow. 

Driven by these observations, the objective of our work is to provide conditions for the 
successful reconstruction of the sampled field, by using a probabilistic approach. In the following 
we give a probabilistic description of the condition number, without explicitly considering 
preconditioning. 

IV. The random matrix approach: unsuccessful signal reconstruction 

The above results are based on deterministic locations of the sampling points. In this section we 
discuss instead the case where the sampling points t q are i.i.d. random variables with uniform 
distribution U[0, 1). In other words we consider the case where the matrix T is random and 
completely defined by the random vector t = [ti, . . . , t r ]. We introduce here the parameter (3 as 
the ratio of the two-sided signal bandwidth 2M + 1 and the number of sensors r 

*=^. (8) 
r 

In the following we consider the asymptotic case where the values of M and r grow to infinity 
while (3 is kept constant. We then show that properties of systems with finite M and r are well 
approximated by the asymptotic results. 

We focus here on the expression of the probability of unsuccessful signal reconstruction, i.e., 
the probability that the reconstruction algorithm fails given the machine precision e, the signal 
bandwidth M, and the number of sensors r. For a given realization of T and for finite values of 
M and r we denote by A = [Ai, . . . , \2M+i] the vector of eigenvalues, and by A m i n = min(A) 
and A max = max(A) the minimum and maximum eigenvalues, respectively. Also let fM,/3{ x ) be 
the empirical probability density function (pdf) of the eigenvalues of T for a finite M and (3 
and let fp{x) be the limiting eigenvalue pdf in the asymptotic case (i.e., when M and r grow 
to infinity with constant (3) [6]. The random variable A m j n = min(A), and the condition number 
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k have pdf an d Im p( x )> respectively. The corresponding cumulative density functions 

(cdf) are denoted by F M ,p{x), Fp(x), F^(x), and F^ p {x). 

A. Some properties of the eigenvalue distribution 

We first analyze by Montecarlo simulation some properties of the distribution fM,p{x)- Figure [3] 
shows histograms of fM,p{x) for M = 1,4, 10,90, (3 = 0.25, and bin width of 0.1. Notice that, 
as M increases with constant (3, the histograms of fM,p{x) seem to converge to fp(x), only 
depending on (3. Indeed, looking at the figure, one can notice that the difference between the 
curves for M — 10 and M = 90 is negligible. Although we report in Figure [3] only the case 
for (3 = 0.25, we observed the same behavior for any value of (3. We therefore conclude that 
M = 10 is large enough to provide a good approximation of fp(x). 

In Figure |4] we show histograms of Jm,p{x) for (3 = 0.15,0.25,0.35,0.45,0.55 and values of 
M around 100. For (3 larger than 0.35 the distribution shows oscillations and tends to infinity 
while x approaching 0. On the other hand, for (3 lower than 0.35 the pdf does not oscillate and 
tends to while x approaching 0. In order to better understand this behavior for small x, which 
can be heavily affected by the bin width, in Figure [5] we consider the cdf F M ^(x) in the log-log 
scale, for various values of (3 ranging from 0.1 to 0.8 and M = 200. The dashed curves represent 
the simulated cdf. Surprisingly they show a linear behavior for small values of x and for any 
value of (3. This is evidenced by the solid lines which are the tangents to the dashed curves at 
Fm,p(x) = 10~ 2 . The slope of the lines is parameterized by (3. In our simulations the machine 
precision is approximately e = 10~ 16 and, hence, values of x < e cannot be represented since 
they are treated as zero by the algorithm. Indeed the simulated pdfs loose their linear behavior 
while approaching x = e (see the case (3 = 0.8 in Figure [5]). We conclude that for i < 1 the 
cdf Fp(x) can be approximated by 

Ff,(x) ss bx a (9) 

where a = a{(3) and b = b{(3) are both functions of (3. By deriving © with respect to x we 
obtain the approximate expression for the pdf: 

fp(x) w a{(3)b{(3)x a ^- x (10) 

From (flOl) it can be seen that the function a((3) represents the slope of Fp(x) in the log-log 
scale for i C 1. Note that in order x"^^ 1 to be integrable in [0, c), for any positive constant c, 
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the condition a{0) > should be satisfied. Note also from Figure [5] that the slope a{[3) = 1 is 
obtained for (3 ~ 0.35. For this value of (3 the approximate pdf is constant for i < 1, which is 
consistent with the results in Figure HI 

Some additional considerations can be drawn from Figure [6l which presents the pdf of fM,0(x) 
for (3 = 0.25, 0.50, 0.75 and M = 200. It is interesting to note that for any value of /3, large 
eigenvalues are less likely to appear than very small eigenvalues. This is evident by observing 
that for x 1 the pdf falls to — oo much faster than for x C 1. This consideration is of great 
relevance when discussing the condition number distribution. 

B. Distribution of the minimum eigenvalue 

For finite M the cdf of A m i n can be computed as follows 

Ftf$(x) = P(A min <x|M) 

= P(min(A) < x\M) 

In general the random variables Ai, . . . , A2M+1 are n °t independent. However, considering suf- 
ficiently large values of M (namely, M > 10), we can write the following upper bound for 

*Eg(x): 

FS${x)<{2M + i)Ff,{x). (11) 

This is obtained by assuming that the eigenvalues are independent with pdf equal to the limiting 
eigenvalue distribution. The simulation results presented in Figure [7] confirm the expression in 
(fTTI) . The figure shows the cdfs of A and A min in the log-log scale for (3 = 0.25,0.50,0.75 
and M = 40. The cdf of A m i n also shows a linear behavior for s < 1. b the log-log scale, 
according to (ITT]) , the two cdfs should be separated by log 10 (2M + 1). In our case: M = 40 
and log 10 (2M + 1) rs 1.91. As is evident from the figure, this upper bound is extremely tight, 
especially for low values of [3. 

C. Distribution of the condition number 

Here we describe the condition number distribution. The condition number is defined by ©. 
As noted at the end of Section ITV-AI the minimum eigenvalue dominates the ratio A max / A min . This 
fact is more evident in Figure [8l where we compare the distributions of the condition number 
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and of the minimum eigenvalue, for (3 = 0.25 and M = 10, 20, 40. The three dashed curves on 
the left represent the pdf of the minimum eigenvalue. The solid lines on the right represent the 
pdf of the condition number for the same values of M. The two set of distributions look very 



similar. We define y = \og 10 x, j™%y) = log 10 /^(10^) and ^{y) = log 10 f^{W). By 



where d is a parameter. In the plot, for each value of M the circles represent the above 
approximation where the parameter cZ is set to 1/3. The same considerations hold for any value 
of (5. Converting the above approximation into the linear scale, we obtain: 



and by taking the derivative of both sides of (PTTI) with respect to x, we finally obtain 



which holds for x^$> 1. 
D. Summary 

In this section we have given numerical evidence of the following facts: 

• the condition number distribution is dominated by the distribution of the minimum eigen- 
value of T; 

• the distribution of the minimum eigenvalue is upper bounded by a simple function of the 
asymptotic distribution of the eigenvalues of T. 

Thus, in the following we focus on fp(x); indeed, knowing fp(x) we could obtain the probability 
that the minimum eigenvalue is below a certain threshold, i.e., that the condition number is less 
the machine precision. 



We now derive some analytic results on the asymptotic eigenvalue distribution, fp{x). Ideally 
we would like to analytically compute fp(x), however such a calculation seems to be prohibitive. 
Therefore, as a first step we compute the closed form expression of the moments of the asymptotic 



observing the results in Figure [H the following relation holds: 





V. Some analytic results on the eigenvalue pdf 
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eigenvalue distribution, E[A P ]. Note that, if all moments are available, the an analytic expression 
of fp(x) can be derived through its moment generating function, by applying the inverse Laplace 
transform. 

In the limit for M and r growing to infinity with constant (3 the expression of E[A P ] can be 
easily obtained from the powers of T. Indeed T is an Hermitian matrix and can be decomposed 
as T = UATJt, where A = diag(A) is a diagonal matrix containing the eigenvalues of T and 
U is the matrix of eigenvectors. It follows that 

Tr{T p } = Tr{(UAU f ) P } 

= Tr{UA p U f } 

= Tr{U f UA p } 

= Tr{A p } 



2M+1 



J2 A < 



i=l 



Then: 



lim 



1 



M,r^+oo 2M + 1 
2M+l _g 



Tr{E [T p ]} 



lim 



1 



M.rw+oo 2M + 1 

2M+1 _p 



-E 



' 2M 



E 



i=0 



2AI 



lim 

M,r-»+oo 2M + 

2M+1 _f } 



+ 1 to- 



E[A P ] 



(12) 



(13) 



Please notice that since T is a Toeplitz matrix the Grenander-Szego [7] theorem could be 
employed in the limit for M — > +oo. Unfortunately in this case the theorem is not applicable 
since all entries of T depend on the matrix size M. 
From (TT3T ) and © we obtain: 

^ T. T, e 



E[A P ] 



lim 



M,r-+oo (2M + l)rP ^ ^ 
2M+i = p v ' qeQ lec 



V 



exp [ 27ri2_ / tg i (£ i - 



where 



Q = {q I q= [0i, ■ ■ ■ ,Qp], fc = l,...,r} 
C = {1 | 1= [£!,...,£„], £i = 0,...,2M} 



(14) 
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and where the sign [•] refers to the modulo p operatoj^]. The average is performed over the 
random vector t = [ti, . . . , t r \. 

Let now V be the set of integers from 1 to p 

V = {l,...,p}. (15) 

Let q G Q and let 1 < A;(q) < p be the number of distinct values assumed by the entries of q. 
Such values can be arranged, in order of appearance, in the vector q = [q l7 . . . , qk(q)} where the 
entries qj are all distinct. Using q and q we create the subsets Vi(q), . . . , "Pfc( q )(q) of V defined 
by 

P j (q) = {ieV | q i = q j }. (16) 

Such subsets are non-empty and disjoint (Vj ^ 0, U Vj = V, and Vj HVh = for j ^ h). 
Finally we define r(q) 

r(q) = {n(q),...,n(q)(q)} 
as the partition of V induced by q. 

Example 1: Let p = 6 and q = [4, 9, 5, 5, 4, 3]. Then, by V = {1,2,3, 4, 5, 6}. 
We have &;(q) = 4 distinct values which we arrange, in order of appearance, in the 
vector q = [4, 9, 5, 3]. Then 

7Mq) = {l,5} (qi = q 5 = qi), 

V 2 (q) = {2} (q 2 = q 2 ), 

P 3 (q) = {3,4} (q 3 = q 4 = q 3 ), 

VM) = {6} (? 6 = 
and r(q) = {{1,5},{2},{3,4},{6}}. 



For any given q G Q, using the definition of Vj(q), we notice that the argument of the average 
operator in (fT4l) factorizes in A;(q) parts, i.e. 

/ p \ k ^ ( 

exp I 2?ri t qi {ii - ) = JJ exp 2wit 9j ^2 ti~ 



2 For simplicity here we follow the convention [p] = p and [p + 1] = 1. 
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each depending on a single random variable t^. Then from (fl4l) we have: 

e ( a i = iim fn ^ x ^ y y e 



lim - 

M,r^+oo (2M + 1)7 

v ; qeQ ie£ 



T J. _ 

r 


=/9 


lim 






00 


2M+1 _ 


=/? 


r 




lim 






00 


2M+1 _ 


=/3 



fc(q) 

JJexp ( 2vri^ £ 



— — r — v v n e 



fe(q) 



exp J 27rit$. ^ 



[i+l] 



; q€Q ie£ 3=1 Vie7>i(q) 



(17) 



where <5(-) is the Kronecker's delta. Expression (fT7l) can be further simplified by observing that 

• there exist r(r — 1) • • • (r — k + 1) = r!/(r — A;)! vectors qGQ generating a certain given 
partition of V made of k subsets, 

• for a given q the expression 



cmi) = e n * I E 

le£ j=l ViePi(q) 



'[i+l] 



(18) 



is a polynomial in the variable 2M, since it represents the number of points with integer 
coordinates contained in the hypercube [0, . . . , 2M] P and satisfying the k(q) constraints 

^-%+i]=0 (19) 

We show in Appendix U that one of these constraints is always redundant and that the number 
of linearly independent constraints is exactly fc(q) — 1. By consequence the polynomial 
C 2 Af(q) has degree p - fc(q) + 1. 

Let T p be the set of distinct partitions of V generated by all vectors qeQ, then from (fTTT) we 

obtain: 



(a) 



(6) 



lim - 

M,r^+oo (2M + 

2M+l=f3 

r F 

1 



lim - 

M,r^+oo (2M + l)rP 



qeQ ie£ j=i \ie-p 3 (q) 



^ XI ^Af(q) 

reTp q=^T 



lim , 

M,r^+oo (2M + l)rP ^ (r - jfe(r))! 

2 A-/ +1 ^ T^lj 



E 



C2M(T) 



(20) 
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where 

• the notation X] q =^- represents the sum over all vectors q generating a certain given partition 

r, 

• the equality (a) has been obtained by substituting (fT8l) . and 

. the equality (6) holds because the number of vectors q generating a given partition r is 

r!/(r- k(r))\. 

We point out that the functions fc(q) and C2A/(q) depend only on the partition r(q) induced by 
q. Since in the third line of (l20l) we removed the dependence on the vectors q, the expression 
of E[A P ] is now function of the partitions r only. Then with a little abuse of notation, in the 
following we refer to the functions k and ( 2 M a s k(r) and (2m(t), respectively. 
Taking the limit we finally obtain: 

E[X P ] = J2 v ( T )P P ~ k{T) 

reT p 

= El E <A^ k 

k=l \r£T p , k J 

where T p ^ is the subset of T p only containing partitions of size k, and 

/ x ,. C2m(t) 

v(t) = lim . - -r — p— - 

V ; M^+oo (2M)P~ k+1 

i.e. u(r) is the coefficient of degree (2M) p ~ k+1 of the polynomial (2m(t). Since 1 < k < p 
from (|2TI) we note that E[A P ] is a polynomial in /3 of degree Again, for the sake of clarity 
we give an example: 

'Notice also that the coefficient v(t) represents the volume of the convex polytope described by the constraints l !19t when 
the variables li are considered real and limited to the interval [0, 1]. By consequence < v(t) < 1. 
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Example 2: Let p = 6 and q given by Example 1. The partition is r = 
{{1, 5}, {2}, {3,4}, {6}}. Then the set of k{r) = 4 constraints (03 are given by: 

4+4 = £ 2 + 4 
4 = 4 

4 + 4 = 4 + 4 
4 = 4 

The last equation is redundant since can be obtained summing up the first three 
constraints. Simplifying we obtain 4 = 4, an d 4 = 4 — 4- Since each variable 
4 ranges from to 2M, the number of integer solutions satisfying the constraints is 
exactly ( 2 m(t) = (2M + l) 3 , and then v(t) = 1. 



To compute (|2T|) we need to enumerate the partitions r E T p . First of all we notice that 7^ 
represents the set of partitions of a p-element set and thus has cardinality \T P \ = B(p) where 
B(p) is the p-th Bell number or exponential number [11], and that the subset T Pt k has cardinality 
Sj>,fc which is a Stirling number of the second kind [12]. An effective way to enumerate such 
partitions is to build a tree of depth p as in Figure [9] A label is given to each node, starting 
from the root which is labeled by "a". The rule for building the tree is as follows: each node 
J\f generates m + 1 leaves, labeled in increasing order starting from "a", and m is the number 
of distinct labels in the path from the root to the node J\f. The number of leaves of such a tree 
of depth p is given by B(p). Each path from the root to a leaf represents a partition r of the set 
V. For a given partition (or path in the tree) the subset Vj is the set of integers corresponding 
to the depths of the j-th label in the path. 

Example 3: Let us consider p = 4 and the path [a, b, a, a] (see Figure [9]). In the path 
there are two distinct labels, namely "a" and "b"; then k(r) = 2. The label "a" is found 
at depths 1,3, and 4, while the label "b" is at depth 2. The partition of V = {1, 2, 3, 4} is 
then given by r = {{1, 3, 4}, {2}}. This partition (or path) contributes to the expression 
of E[A P ] = E[A 4 ] with the term v(r)/3 p - k = (3 2 since in this case v(t) = 1. 
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Using the procedure described above we can derive in closed form any moment of A. Here 
we report the first few moments: 



E[A] 


= 1 


E[A 2 ] 


= l + (3 


E[A 3 ] 


= l + 3p + p 2 


E[A 4 ] 


= l + Q(3+^-f3 2 + f3 3 


E[A 5 ] 


= 1 + 10/5 + y/? 2 + f /5 3 + /5 4 



In practice the algorithm complexity prevents us from computing moments of order greater than 
p = 12. To the best of our knowledge, a closed form expression of the generic moment of A 
is still unknown. If all moments were available, then an analytic expression of fp{x) could be 
derived through its moment generating function &p(s) 

f fi (x)erdx = Y,-^r* (22) 
p=o P' 

by applying the inverse Laplace transform. 
A. Validation 

We compare the moments of A obtained by simulation with those obtained with the above 
closed form analysis. Table U compares the exact values of the moments of fp(x), and the values 
obtained by Montecarlo simulation, for (3 = 0.25, 0.50, 0.75 and p — 1, ... ,5. For each value of 
(3 the Table shows three columns. The first column, labeled "Sim" presents the values obtained 
by simulation, using M = 200. The second column, labeled "Exact", reports the values obtained 
using (fTTT) without taking the limit (i.e., using finite values of M and r). The third column, 
labeled "Limit", presents the limit values obtained through (I2TI) . The excellent match between 
simulation analytic results shows the validity of our findings. 

VI. Conclusions 

We considered a large-scale wireless sensor network sampling a physical field, and we in- 
vestigated the relationship between the network topology and the probability of successful 
field reconstruction. In the case of deterministic sensor locations, we derived some sufficient 
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TABLE I 

Comparison of the moments of A obtained by simulation and by closed form analysis for M = 200, and 

[3 = 0.25,0.50,0.75. 





(3 = 0.25 


/3 = 0.50 


/3 = 0.75 




Sim 


Exact 


Limit 


Sim 


Exact 


Limit 


Sim 


Exact 


Limit 


p=l 


1.000 


1.000 


1.000 


1.000 


1.000 


1.000 


1.000 


1.000 


1.000 


p=2 


1.249 


1.249 


1.250 


1.499 


1.499 


1.500 


1.748 


1.748 


1.750 


p=3 


1.810 


1.810 


1.812 


2.746 


2.744 


2.750 


3.802 


3.801 


3.812 


p=4 


2.926 


2.925 


2.932 


5.778 


5.771 


5.792 


9.630 


9.620 


9.672 


p=5 


5.152 


5.152 


5.176 


13.51 


13.49 


13.56 


27.41 


27.35 


27.57 



conditions for successful reconstruction, by reviewing the literature on irregular sampling. Then, 
we considered random network topologies, and employed random matrix theory. By doing so, 
we were able to derive some conditions under which the field can be successfully reconstructed 
with a given probability. 

A great deal of work still has to be done. However, to the best of our knowledge, this work 
is the first attempt at solving the problem of identifying the conditions on random network 
topologies for the reconstruction of sensor fields. Furthermore, we believe that the basis we 
provided for an analytical study of the problem can be of some utility in other fields besides 
sensor networks. 

Appendix I 
The constraints 

Let us consider a vector of integers q of size p partitioning the set V = {1, . . . ,p} in k subsets 
Vj, 1 < j < k and the set of k constraints 

Y, 1 *- VH =°- ( 23 ) 

lev. 

We first show that one of such constraint is always redundant. 
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A. Redundant constraint 

Choose an integer j, 1 < j < k. Summing up together the constraints, except the j-th, we get 

k 

h=i iev h 
ieV/T>j 

iev ieVj 



which gives the j-th constraint 



Thus one of the constraints (TT9l) is always redundant. We now show that the remaining k — 1 
constraints are linearly independent. 

5. Linear independence 

The constraints (fT9l ), after some simplifications, can be rearranged in the form 

A1 T = 

where A is a k x p matrix and 1 = [£i, . . . , £ p ]. We have previously shown that the rank of A 
is such that 

p(A) < k - 1 (25) 

since one constraint is redundant and k < p. We prove now that the rank of A is exactly k — 1. 

It is possible to write A as A = A' — A" where (A')jj = 1 if i G Vj, and elsewhere. The 
matrix A' has rank k since its rows are linearly independent due to the fact that subsets Vj have 
empty intersection. Similarly (A")^ = 1 if [i — 1] 6 Vj, and elsewhere. In practice the matrix 
A" is the matrix A' circularly shifted by one position to the right. Hence it can be written as 

A" = A'Z 
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where Z is the px p right-shift matrix, i.e. the entries of the i-th row of Z are zeroes except for 
a "1" at position [i + 1]. By consequence 



where 



A = A' - A'Z = A'(I„ - Z) 



+1 -1 

o '•• ■-. 



(i P - z) 





-1 



+1 



has rank p(l p — Z) = p — 1. By consequence, using the property 

p(A) = p(A'(I p -Z)) 

> p(A')+p(I p -Z)-p 
= k-1 

Considering together (1251) and (T26l) we conclude p(A) = — 1. 



(26) 
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Fig. 1. Example of a reconstructed signal from irregular sampling, for r = 26, M = 10, j3 = 0.807 
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Fig. 2. Example of a badly reconstructed signal due to numerical instability for r = 21, M = 10, ft = 1 




Fig. 3. Histograms of fM,p(x) for /3 = 0.25 and increasing values of M 
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-3-2-10 1 2 3 4 

log 10 x 

Fig. 8. Histograms of and ffc^x) in the log-log scale for (3 = 0.25 and M = 10, 20, 40 




B(p) 



Fig. 9. Partitions tree 



February 1, 2008 DRAFT 



